DETAILED ACTION
Continued Examination Under 37 CFR 1.114 

1. A request for continued examination under 37 CFR 1.114, including the fee set
forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this
application is eligible for continued examination under 37 CFR 1.114, and the fee set
forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action
has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 
06/13/2008 has been entered. 

Response to Arguments

2. Applicant's arguments with respect to claims 1-22 have been considered but are
moot in view of the new grounds of rejection. Examiner has withdrawn the reference of 
Lew and has incorporated a new reference: Pillay et al. US 20030195645 A1 
(hereinafter Pillay). Pillay teaches the extraction of data from a biphase encoded audio 
stream using a time window, wherein subframes and preambles are present within the 
audio stream. Pillay also teaches the estimation of bit sample length as well as 
extraction of audio data from an audio stream, wherein a time window is used. Pillay 
accomplishes the same limitations disclosed within claims 1-22, even with the use of 
PLL's to decode a biphase signal and extract frames. 

Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claims 1-4, 8, 11-18, and 20-22 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Lew, US 5245667 A (hereinafter Lew) in view of Pillay et al. US
20030195645 A1 (hereinafter Pillay) and further in view of Gillick et al US 4837831 A
(hereinafter Gillick). 



Application/Control Number: 10/519,000 Page 4 

Art Unit: 2626 

Re claims 1 and 11, Lew teaches a method of extracting digital audio data words
from a serialized stream of digital audio data (Col. 4 lines 34-68 & Fig. 2), comprising: 

constructing a timing window from an estimated bit time for said serialized 
stream of digital audio data, said timing window having a preamble sub-window (Col. 6 
lines 16-32) and at least one data sub-window (Col. 4 lines 34-68 & Fig. 2); 

extracting plural digital audio data words from said serialized stream of digital 
audio (col. 1 line 22-28) based upon the location of each transition in said serialized 
stream of digital audio data relative to said preamble sub-window (Col. 6 lines 16-32) 
and said at least one data sub-window of said timing window (Col. 4 lines 34-68 & Fig. 
2); 

each one of said extracted plural digital audio data words having a preamble 
identifiable by a combination of at least one transition located in said preamble sub- 
window (Col. 6 lines 16-32) of said timing window and at least one transition located in 
said at least one data sub-window of said timing window. 
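
For illustration only, the timing-window limitation recited above can be sketched as a classifier that compares the spacing between successive transitions with sub-windows scaled from an estimated bit time. The function name and the window bounds (0.25, 0.75, 1.25 and 1.75 times the bit time) are assumptions chosen for this sketch; they are not taken from the claims, Lew, or Pillay.

```python
def classify_interval(interval, bit_time):
    """Classify the time between two transitions against a timing window.

    The bounds are illustrative assumptions, not values from the record.
    """
    ratio = interval / bit_time
    if 0.25 <= ratio < 0.75:
        return "data-half"   # short spacing: falls in a data sub-window
    if 0.75 <= ratio < 1.25:
        return "data-full"   # one-bit spacing: falls in a data sub-window
    if 1.25 <= ratio < 1.75:
        return "preamble"    # long spacing: falls in the preamble sub-window
    return "invalid"         # outside every sub-window


# Hypothetical transition spacings, measured in fast-clock ticks.
intervals = [10, 5, 5, 15, 10]
labels = [classify_interval(t, bit_time=10) for t in intervals]
print(labels)
```

With an assumed bit time of 10 ticks, the spacing of 15 ticks is the only one that lands in the preamble sub-window; the others land in a data sub-window.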

However, Lew fails to teach a preamble identifiable by a combination of at least 
one transition located in said preamble sub-window of said timing window and at least 
one transition located in said at least one data sub-window of said timing window. 

Pillay teaches a method of extracting a clock from a biphase encoded bit stream 
includes the step of detecting a stream of samples each having a sample size 
measured between consecutive bit phase transitions. A sample length is determined for 
each sample, the sample length approximating a number of least common multiples in 
the corresponding sample size. A preamble is detected from the sample lengths of a 
sequence of the samples and decoded to determine an expected logic level of the clock
following a transition at an expected clock edge. The expected level of the clock is 
gated with the biphase encoded data to generate a control signal in advance of the 
opening of the time window (Pillay Abstract). 
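
A minimal sketch of the measurement step summarized from the Pillay abstract follows: it counts the samples between consecutive transitions in an oversampled stream and approximates each count as a whole number of a common unit. The function names and the unit of 4 samples are hypothetical, and the way Pillay actually selects the unit is not reproduced here.

```python
def run_lengths(samples):
    """Return the number of samples between consecutive transitions."""
    if not samples:
        return []
    lengths = []
    count = 1
    for prev, cur in zip(samples, samples[1:]):
        if cur == prev:
            count += 1
        else:
            lengths.append(count)  # a transition closes the current run
            count = 1
    lengths.append(count)  # final run (may be cut short by the end of capture)
    return lengths


def quantize(lengths, unit):
    """Approximate each run as a whole number of `unit` samples (at least 1)."""
    return [max(1, round(n / unit)) for n in lengths]


# Hypothetical oversampled capture: runs of 4, 8, 12 and 4 samples.
samples = [0] * 4 + [1] * 8 + [0] * 12 + [1] * 4
sizes = run_lengths(samples)
print(sizes)                  # [4, 8, 12, 4]
print(quantize(sizes, unit=4))  # [1, 2, 3, 1]
```

In biphase-mark coded AES3 data, a run of three units is the kind of long interval that occurs only inside a preamble, which is why the sequence of quantized lengths can be used to locate one.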

Additionally, Pillay teaches a logic level transition of the data bit at each active 
edge of the bit clock, otherwise it is considered to be an error in the encoding scheme. 
(With the exception that preambles by definition include one or more of such biphase 
errors.) FIG. 7 illustrates a portion of a typical AES/EBU (SPD/IF) data stream. The 
stream is divided into blocks each composed of 192 frames (Frames 0-191). Each 
frame in turn is composed of a pair of subframes, each including Channel A and Channel
B data, along with one of three types of 4-bit preambles. An X preamble precedes each 
Channel A subframe (except at the beginning of the block), a Y preamble precedes 
each Channel B subframe and a Z preamble precedes each Channel A subframe at the 
beginning of the block (Pillay [0061]). 
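
The preamble assignment described in the passage quoted from Pillay [0061] can be sketched as a lookup on frame position and channel. The function name and the string labels are assumptions made for this illustration; only the 192-frame block length and the X, Y and Z rules come from the quoted passage.

```python
FRAMES_PER_BLOCK = 192  # block length stated in the quoted passage


def preamble_type(frame_index, channel):
    """Return the preamble ("X", "Y" or "Z") that precedes a subframe."""
    if channel not in ("A", "B"):
        raise ValueError("channel must be 'A' or 'B'")
    if channel == "B":
        return "Y"  # every Channel B subframe
    # Channel A: Z at the beginning of a block, X otherwise.
    return "Z" if frame_index % FRAMES_PER_BLOCK == 0 else "X"


print(preamble_type(0, "A"))    # Z
print(preamble_type(0, "B"))    # Y
print(preamble_type(1, "A"))    # X
print(preamble_type(192, "A"))  # Z
```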

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Lew to incorporate a preamble identifiable 
by a combination of at least one transition located in said preamble sub-window of said 
timing window and at least one transition located in said at least one data sub-window 
of said timing window as taught by Pillay to allow for the reduction of biphase errors 
when extracting and generating clock and control signals, wherein frames, subframes, 
and preambles are present within the time window (Pillay [0061]). 

However, Lew in view of Pillay fails to teach extracted plural digital audio data
words.

Gillick teaches that, after the acquisition of multiple utterances of each vocabulary
word, method 100 advances to step 106. This step performs a plurality of substeps
108, 110, and 112 for each word in the vocabulary. The first of these substeps, step
108, itself comprises two substeps, 114 and 116, which are performed for each
utterance of each word. Step 114 finds an anchor for each utterance, that is, the first
location in the utterance at which it has attained a certain average threshold amplitude.
Step 116 calculates five smoothed frames for each utterance, positioned relative to its
anchor. Additionally, Gillick teaches in reference to figures 4 and 5, where FIG. 4 
schematically represents how such smoothed frames are calculated. A smoothed 
frame 118 is calculated from five individual frames 104A-104E, of the type described 
above with regard to FIG. 3. According to this process, each pair of successive 
individual frames 104 are averaged, to form one second level frame 120. Thus the
individual frames 104A and 104B are averaged to form the second level frame 120A, 
and the individual frames 104B and 104C are averaged to form the second level frame 
120B, and so on, as is shown in FIG. 4 (Gillick Col. 8 lines 5-37).
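
The pairwise averaging described in the Gillick passage can be sketched as follows. The passage spells out only the second level (frames 120A, 120B, and so on); repeating the same averaging until a single smoothed frame remains is an assumption of this sketch, as are the function names and the representation of a frame as a list of numbers.

```python
def average_pairs(frames):
    """Average each pair of successive frames to form the next level."""
    return [
        [(a + b) / 2 for a, b in zip(first, second)]
        for first, second in zip(frames, frames[1:])
    ]


def smooth(frames):
    """Reduce a run of frames to one smoothed frame by repeated pair averaging."""
    if not frames:
        raise ValueError("at least one frame is required")
    level = [list(f) for f in frames]
    while len(level) > 1:
        level = average_pairs(level)  # each level has one frame fewer
    return level[0]


# Five hypothetical one-parameter frames standing in for 104A-104E.
frames = [[0.0], [4.0], [8.0], [12.0], [16.0]]
print(average_pairs(frames))  # [[2.0], [6.0], [10.0], [14.0]]
print(smooth(frames))         # [8.0]
```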

Therefore, it would have been obvious to one of ordinary skill in the art at the
time of the invention to modify the system of Lew in view of Pillay to incorporate
extracted plural digital audio data words within an audio stream as taught by Gillick to 
allow for the detection and location of multiple utterances within an audio signal, 
wherein the repetition of an utterance is not limited to adjacent words and can have a
separation between the repeated words, where transitions from a repeated word to the
next can be smoothed and the location of occurrence can be stored in memory as to 
detect a repeating segment of multiple words (i.e. chorus) (Gillick Col. 8 lines 5-37). 



Re claims 2 and 15, Lew teaches a pair of successive transitions (Col. 8 lines 24- 
42) located in said preamble sub-window followed by a pair of successive transitions 
located in said at least one data sub-window (Col. 4 lines 34-68 & Fig. 2). 

However, Lew fails to teach a preamble sub-window followed by a pair of
successive transitions.

Pillay teaches a method of extracting a clock from a biphase encoded bit stream 
includes the step of detecting a stream of samples each having a sample size 
measured between consecutive bit phase transitions. A sample length is determined for 
each sample, the sample length approximating a number of least common multiples in 
the corresponding sample size. A preamble is detected from the sample lengths of a 
sequence of the samples and decoded to determine an expected logic level of the clock 
following a transition at an expected clock edge. The expected level of the clock is 
gated with the biphase encoded data to generate a control signal in advance of the 
opening of the time window (Pillay Abstract). 

Additionally, Pillay teaches a logic level transition of the data bit at each active 
edge of the bit clock, otherwise it is considered to be an error in the encoding scheme. 
(With the exception that preambles by definition include one or more of such biphase 
errors.) FIG. 7 illustrates a portion of a typical AES/EBU (SPD/IF) data stream. The
stream is divided into blocks each composed of 192 frames (Frames 0-191). Each
frame in turn is composed of a pair of subframes, each including Channel A and Channel
B data, along with one of three types of 4-bit preambles. An X preamble precedes each 
Channel A subframe (except at the beginning of the block), a Y preamble precedes 
each Channel B subframe and a Z preamble precedes each Channel A subframe at the 
beginning of the block (Pillay [0061]). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Lew to incorporate preamble sub-window 
followed by a pair of successive transitions located in said at least one data sub-window 
as taught by Pillay to allow for the reduction of biphase errors when extracting and 
generating clock and control signals, wherein frames, subframes, and preambles are 
present within the time window (Pillay [0061]). 

However, Lew in view of Pillay fails to teach the method of claim 1, and further
comprising identifying said extracted data words as having a first type of preamble if 
said extracted data words have a pair of successive transitions. 

Gillick teaches that, after the acquisition of multiple utterances of each vocabulary
word, method 100 advances to step 106. This step performs a plurality of substeps
108, 110, and 112 for each word in the vocabulary. The first of these substeps, step
108, itself comprises two substeps, 114 and 116, which are performed for each
utterance of each word. Step 114 finds an anchor for each utterance, that is, the first
location in the utterance at which it has attained a certain average threshold amplitude.
Step 116 calculates five smoothed frames for each utterance, positioned relative to its 
anchor. Additionally, Gillick teaches in reference to figures 4 and 5, where FIG. 4 
schematically represents how such smoothed frames are calculated. A smoothed 
frame 118 is calculated from five individual frames 104A-104E, of the type described
above with regard to FIG. 3. According to this process, each pair of successive 
individual frames 104 are averaged, to form one second level frame 120. Thus the 
individual frames 104A and 104B are averaged to form the second level frame 120A, 
and the individual frames 104B and 104C are averaged to form the second level frame 
120B, and so on, as is shown in FIG. 4 (Gillick Col. 8 lines 5-37). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Lew in view of Pillay to incorporate 
extracted plural digital audio data words within an audio stream as taught by Gillick to 
allow for the detection and location of multiple utterances within an audio signal, 
wherein the repetition of an utterance is not limited to adjacent words and can have a
separation between the repeated words, where transitions from a repeated word to the 
next can be smoothed and the location of occurrence can be stored in memory as to 
detect a repeating segment of multiple words (i.e. chorus) (Gillick Col. 8 lines 5-37). 

Re claims 3 and 16, Lew teaches preamble sub-window (Col. 6 lines 16-32) 
separated by a pair of successive transitions located in said at least one data sub-
window.

However, Lew fails to teach the method of claim 2, and further comprising
identifying said extracted data words as having a second type of preamble if said 
extracted data words have a pair of non-successive transitions. 

Pillay teaches a method of extracting a clock from a biphase encoded bit stream 
includes the step of detecting a stream of samples each having a sample size 
measured between consecutive bit phase transitions. A sample length is determined for 
each sample, the sample length approximating a number of least common multiples in 
the corresponding sample size. A preamble is detected from the sample lengths of a 
sequence of the samples and decoded to determine an expected logic level of the clock 
following a transition at an expected clock edge. The expected level of the clock is 
gated with the biphase encoded data to generate a control signal in advance of the 
opening of the time window (Pillay Abstract). 

Additionally, Pillay teaches a logic level transition of the data bit at each active 
edge of the bit clock, otherwise it is considered to be an error in the encoding scheme. 
(With the exception that preambles by definition include one or more of such biphase 
errors.) FIG. 7 illustrates a portion of a typical AES/EBU (SPD/IF) data stream. The 
stream is divided into blocks each composed of 192 frames (Frames 0-191). Each 
frame in turn is composed of a pair of subframes, each including Channel A and Channel
B data, along with one of three types of 4-bit preambles. An X preamble precedes each 
Channel A subframe (except at the beginning of the block), a Y preamble precedes 
each Channel B subframe and a Z preamble precedes each Channel A subframe at the 
beginning of the block (Pillay [0061]). 

Therefore, it would have been obvious to one of ordinary skill in the art at the
time of the invention to modify the system of Lew to incorporate identifying said
extracted data words as having a second type of preamble if said extracted data words
have a pair of non-successive transitions as taught by Pillay to allow for the reduction of
biphase errors when extracting and generating clock and control signals, wherein
frames, subframes, and preambles are present within the time window (Pillay [0061]).

Gillick teaches that, after the acquisition of multiple utterances of each vocabulary
word, method 100 advances to step 106. This step performs a plurality of substeps
108, 110, and 112 for each word in the vocabulary. The first of these substeps, step
108, itself comprises two substeps, 114 and 116, which are performed for each 
utterance of each word. Step 114 finds an anchor for each utterance, that is, the first 
location in the utterance at which it has attained a certain average threshold amplitude. 
Step 116 calculates five smoothed frames for each utterance, positioned relative to its 
anchor. Additionally, Gillick teaches in reference to figures 4 and 5, where FIG. 4 
schematically represents how such smoothed frames are calculated. A smoothed 
frame 118 is calculated from five individual frames 104A-104E, of the type described 
above with regard to FIG. 3. According to this process, each pair of successive 
individual frames 104 are averaged, to form one second level frame 120. Thus the 
individual frames 104A and 104B are averaged to form the second level frame 120A, 
and the individual frames 104B and 104C are averaged to form the second level frame 
120B, and so on, as is shown in FIG. 4 (Gillick Col. 8 lines 5-37). 

Therefore, it would have been obvious to one of ordinary skill in the art at the
time of the invention to modify the system of Lew in view of Pillay to incorporate 
extracted plural digital audio data words within an audio stream as taught by Gillick to 
allow for the detection and location of multiple utterances within an audio signal, 
wherein the repetition of an utterance is not limited to adjacent words and can have a
separation between the repeated words, where transitions from a repeated word to the 
next can be smoothed and the location of occurrence can be stored in memory as to 
detect a repeating segment of multiple words (i.e. chorus) (Gillick Col. 8 lines 5-37). 

Re claims 4 and 17, Lew teaches the method of claim 3, and further comprising 
identifying said extracted data words as having a third type of preamble (Col. 6 lines
16-32) if said extracted data words have a transition located in said preamble sub-window
followed by first, second and third transitions located in said at least one data sub- 
window. 

However, Lew fails to teach a third type of preamble if said extracted data words 
have a transition located in said preamble sub-window followed by first, second and 
third transitions located in said at least one data sub-window.

Pillay teaches a method of extracting a clock from a biphase encoded bit stream 
includes the step of detecting a stream of samples each having a sample size 
measured between consecutive bit phase transitions. A sample length is determined for 
each sample, the sample length approximating a number of least common multiples in 
the corresponding sample size. A preamble is detected from the sample lengths of a
sequence of the samples and decoded to determine an expected logic level of the clock 
following a transition at an expected clock edge. The expected level of the clock is 
gated with the biphase encoded data to generate a control signal in advance of the 
opening of the time window (Pillay Abstract). 

Additionally, Pillay teaches a logic level transition of the data bit at each active 
edge of the bit clock, otherwise it is considered to be an error in the encoding scheme. 
(With the exception that preambles by definition include one or more of such biphase 
errors.) FIG. 7 illustrates a portion of a typical AES/EBU (SPD/IF) data stream. The 
stream is divided into blocks each composed of 192 frames (Frames 0-191). Each
frame in turn is composed of a pair of subframes, each including Channel A and Channel
B data, along with one of three types of 4-bit preambles. An X preamble precedes each 
Channel A subframe (except at the beginning of the block), a Y preamble precedes 
each Channel B subframe and a Z preamble precedes each Channel A subframe at the 
beginning of the block (Pillay [0061]). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Lew to incorporate a third type of preamble
if said extracted data words have a transition located in said preamble sub-window
followed by first, second and third transitions located in said at least one data sub- 
window as taught by Pillay to allow for the reduction of biphase errors when extracting 
and generating clock and control signals, wherein frames, subframes, and preambles 
are present within the time window (Pillay [0061]). 

Re claims 8 and 18, Lew teaches the method of claim 1, wherein said estimated
bit time is derived from said serialized stream of digital audio data (Col. 4 lines 34-68 & 
Fig. 2). 

Re claims 12 and 13, Lew teaches the method of claim 11, wherein said fast
sample rate is at least about twenty times faster than a data rate for said serialized 
stream of digital audio data (Lew Col. 5 line 44-53). 

Re claims 14 and 21, Lew teaches the method of claim 13, wherein each one of
said extracted plural digital audio data words has a preamble (Col. 6 lines 16-32) 
identifiable by a combination of at least one transition located in said preamble sub-
window (Col. 6 lines 16-32) of said timing window and at least one transition located in
said at least one data sub-window of said timing window (Col. 4 lines 34-68 & Fig. 2). 

However, Lew fails to teach at least one transition located in said preamble sub-
window of said timing window and at least one transition located in said at least one
data sub-window of said timing window. 

Pillay teaches a method of extracting a clock from a biphase encoded bit stream 
includes the step of detecting a stream of samples each having a sample size 
measured between consecutive bit phase transitions. A sample length is determined for 
each sample, the sample length approximating a number of least common multiples in 
the corresponding sample size. A preamble is detected from the sample lengths of a 
sequence of the samples and decoded to determine an expected logic level of the clock
following a transition at an expected clock edge. The expected level of the clock is 
gated with the biphase encoded data to generate a control signal in advance of the 
opening of the time window (Pillay Abstract). 

Additionally, Pillay teaches a logic level transition of the data bit at each active 
edge of the bit clock, otherwise it is considered to be an error in the encoding scheme. 
(With the exception that preambles by definition include one or more of such biphase 
errors.) FIG. 7 illustrates a portion of a typical AES/EBU (SPD/IF) data stream. The 
stream is divided into blocks each composed of 192 frames (Frames 0-191). Each 
frame in turn is composed of a pair of subframes, each including Channel A and Channel
B data, along with one of three types of 4-bit preambles. An X preamble precedes each 
Channel A subframe (except at the beginning of the block), a Y preamble precedes 
each Channel B subframe and a Z preamble precedes each Channel A subframe at the 
beginning of the block (Pillay [0061]). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Lew to incorporate at least one transition 
located in said preamble sub-window of said timing window and at least one transition
located in said at least one data sub-window of said timing window as taught by Pillay to 
allow for the reduction of biphase errors when extracting and generating clock and 
control signals, wherein frames, subframes, and preambles are present within the time 
window (Pillay [0061]). 

Re claim 20, Lew teaches a bi-phase decoder for use in decoding a stream of
AES-3 digital audio data, comprising: 

a decoder circuit coupled to receive a stream of AES-3 digital audio data, an 
estimated bit time for said stream of AES-3 digital audio data (Col. 4 lines 34-68 & Fig. 
2) and a fast clock, said fast clock having a frequency of about at least twenty times 
faster than a frequency of said stream of AES-3 digital audio data (Lew Col. 5 line 44- 
53); 

a data store (Col. 8 line 24-42) coupled to said decoder circuit, said data store
receiving sub frames of digital audio data extracted from said stream of AES-3 digital
(Col. 4 lines 34-68 & Fig. 2) audio data by said decoder circuit (Fig. 1 & Col. 3 line 55- 
65); 

said decoder circuit extracting sub frames of said digital audio data by 
constructing a timing window from said estimated bit time (Col. 4 lines 34-68 & Fig. 2), 
sampling said stream of AES-3 digital audio data using said fast clock (Lew Col. 5 line 
44-53) and applying said sampled stream of AES-3 digital audio data to said timing 
window to identify transitions (Col. 6 lines 16-32), in said sampled stream of AES-3 
digital audio data, indicative of preambles of said sub frames of digital audio data. 

However, Lew fails to teach extracting sub frames of said digital audio data by
constructing a timing window and sampling a stream of AES-3 digital audio data,
indicative of preambles of said sub frames of digital audio data.

Pillay teaches a method of extracting a clock from a biphase encoded bit stream
includes the step of detecting a stream of samples each having a sample size 
measured between consecutive bit phase transitions. A sample length is determined for 
each sample, the sample length approximating a number of least common multiples in 
the corresponding sample size. A preamble is detected from the sample lengths of a 
sequence of the samples and decoded to determine an expected logic level of the clock 
following a transition at an expected clock edge. The expected level of the clock is 
gated with the biphase encoded data to generate a control signal in advance of the 
opening of the time window (Pillay Abstract). 

Additionally, Pillay teaches a logic level transition of the data bit at each active 
edge of the bit clock, otherwise it is considered to be an error in the encoding scheme. 
(With the exception that preambles by definition include one or more of such biphase 
errors.) FIG. 7 illustrates a portion of a typical AES/EBU (SPD/IF) data stream. The 
stream is divided into blocks each composed of 192 frames (Frames 0-191). Each
frame in turn is composed of a pair of subframes, each including Channel A and Channel
B data, along with one of three types of 4-bit preambles. An X preamble precedes each 
Channel A subframe (except at the beginning of the block), a Y preamble precedes 
each Channel B subframe and a Z preamble precedes each Channel A subframe at the 
beginning of the block (Pillay [0061]). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Lew to incorporate extracting sub frames of 
said digital audio data by constructing a timing window and sampling a stream of AES-3 
digital audio data, indicative of preambles of said sub frames of digital audio data as
taught by Pillay to allow for the reduction of biphase errors when extracting and 
generating clock and control signals, wherein frames, subframes, and preambles are 
present within the time window (Pillay [0061]). 

Re claim 22, Lew teaches the apparatus of claim 21, and further comprising a bit
time estimator circuit having an input coupled to receive said stream of AES-3 digital 
(Col. 4 lines 34-68 & Fig. 2) audio data and an output coupled to said decoder circuit 
(Col. 4 lines 34-68 & Fig. 3), said bit time estimator determining said estimated bit time 
for output to said decoder circuit (Col. 4 lines 34-68 & Fig. 3). 

5. Claims 5-7 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Lew US 5245667 A (hereinafter Lew) in view of Pillay et al. US 20030195645 A1 
(hereinafter Pillay) and Gillick et al US 4837831 A (hereinafter Gillick) and further 
in view of Akagiri US 5490130 (hereinafter Akagiri). 

Re claims 5-7, Lew teaches the method of claim 4, wherein said timing window is 
constructed such that said at least one data sub-window includes a first data sub-window
(Col. 4 lines 34-68 & Fig. 2). 

However, Lew in view of Pillay and Gillick fails to teach a sub-window which 
extends from about % times said estimated bit time to about % times said estimated bit 
time and a second data sub window which extends from about % times said estimated 
bit time to about 1 % times said estimated bit time. 

NOTE: The use of "about" is construed to be an estimate with no fixed range or
deviation limitation, where 1.5 or even 2 can be considered close to .5 without a specific
variation constraint and is therefore construed to be functionally equivalent to a scaling of
.25 or .5. 

Akagiri teaches that the frequency range signals are then divided in time into blocks to
which block floating processing and orthogonal transform processing is applied. The
block length decision circuit 45 adaptively determines the block length of the blocks in 
each of the frequency ranges according to dynamic characteristics of the digital input 
signal. The digital input signal is notionally divided in time into frames. Then, after the 
digital input signal is divided into plural frequency range signals, each frequency range 
signal is divided into the blocks in which the frequency range signal will be orthogonally 
transformed. Each block corresponds to a frame or an integral fraction (e.g., 1/2, 1/4) of 
a frame. Thus, the maximum block length in which each frequency range signal is 
orthogonally transformed is equal to the frame length (Akagiri Col. 15 lines 7-20). 
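
As a rough illustration of the block division described in the Akagiri passage, the sketch below splits one frame into blocks whose length is the frame length or an integral fraction of it. The function name and the 8-sample frame are hypothetical, and the adaptive choice of divisor made by the block length decision circuit is not modeled.

```python
def split_frame(frame, divisor):
    """Split a frame into `divisor` equal blocks (divisor 1, 2, 4, ...)."""
    if divisor < 1 or not frame or len(frame) % divisor != 0:
        raise ValueError("frame length must be a positive multiple of divisor")
    block_len = len(frame) // divisor
    return [frame[i:i + block_len] for i in range(0, len(frame), block_len)]


frame = list(range(8))  # hypothetical 8-sample frame
print(split_frame(frame, 1))  # one block equal to the whole frame
print(split_frame(frame, 2))  # two blocks of half a frame each
print(split_frame(frame, 4))  # four blocks of a quarter frame each
```

A divisor of 1 gives the maximum block length, which equals the frame length, consistent with the last sentence of the quoted passage.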

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Lew in view of Pillay and Gillick to 
incorporate a window or frame that is scaled by about .25 to 1.25 as taught by Akagiri to
allow for orthogonal constraints to be met during the transformation of a signal from the 
time to frequency range as to not overlap data between adjacent frames by 
extending/shortening a frame (Akagiri Col. 15 lines 7-20). 

6. Claims 9-10 and 19 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Lew US 5245667 A (hereinafter Lew) in view of Pillay et al. US
20030195645 A1 (hereinafter Pillay) and Gillick et al US 4837831 A (hereinafter
Gillick) and further in view of Tackin US 7180892 (hereinafter Tackin). 

Re claims 9-10 and 19, Lew teaches the method of claim 18, and further
comprising:

identifying transitions in said serialized stream of digital audio data which occur 
within said constructed bit window (Col. 4 lines 34-68 & Fig. 2), 

However, Lew fails to teach the time separating a set of successive identified 
transitions being a measurement of said estimated bit time. 

Gillick teaches that, after the acquisition of multiple utterances of each vocabulary
word, method 100 advances to step 106. This step performs a plurality of substeps
108, 110, and 112 for each word in the vocabulary. The first of these substeps, step
108, itself comprises two substeps, 114 and 116, which are performed for each 
utterance of each word. Step 114 finds an anchor for each utterance, that is, the first 
location in the utterance at which it has attained a certain average threshold amplitude. 
Step 116 calculates five smoothed frames for each utterance, positioned relative to its 
anchor. Additionally, Gillick teaches in reference to figures 4 and 5, where FIG. 4 
schematically represents how such smoothed frames are calculated. A smoothed 
frame 118 is calculated from five individual frames 104A-104E, of the type described 
above with regard to FIG. 3. According to this process, each pair of successive 
individual frames 104 are averaged, to form one second level frame 120. Thus the 
individual frames 104A and 104B are averaged to form the second level frame 120A,
and the individual frames 104B and 104C are averaged to form the second level frame 
120B, and so on, as is shown in FIG. 4 (Gillick Col. 8 lines 5-37). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Lew in view of Pillay to incorporate the time 
separating a set of successive identified transitions being a measurement of said 
estimated bit time, as taught by Gillick, to allow for the detection and location of multiple 
utterances within an audio signal, wherein the repetition of an utterance is not limited to 
adjacent words and can have a separation between the repeated words, where 
transitions from a repeated word to the next can be smoothed and the location of 
occurrence can be stored in memory so as to detect a repeating segment of multiple words 
(i.e. chorus) (Gillick Col. 8 lines 5-37). 

However, Lew in view of Pillay and Gillick fails to teach estimating minimum and 
maximum bit window times; constructing a bit window from said minimum and maximum 
bit window; and determining said estimated bit time from a running average of plural 
measurements of said estimated bit time (Tackin Col. 26 line 53 - Col. 27 line 9). 

Tackin teaches that the voice synchronizer should operate with or without sequence 
numbers, time stamps, and SID packets. The voice synchronizer should also operate 
with voice packets arriving out of order and lost voice packets. In addition, the voice 
synchronizer preferably provides a variety of configuration parameters which can be 
specified by the host for optimum performance, including minimum and maximum target 
holding time. With these two parameters, it is possible to use a fully adaptive jitter 
buffer by setting the minimum target holding time to zero msec and the maximum target 
holding time to 500 msec (or the limit imposed due to memory constraints). Although 
the preferred voice synchronizer is fully adaptive and able to adapt to varying network 
conditions, those skilled in the art will appreciate that the voice synchronizer can also be 
maintained at a fixed holding time by setting the minimum and maximum holding times 
to be equal. These estimates are periodically quantized and transmitted in a SID packet 
by the comfort noise estimator (usually at the end of a talk spurt and periodically during 
the ensuing silent segment, or when the background noise parameters change 
appreciably). The comfort noise estimator 81 should update the long running averages, 
when necessary, decide when to transmit a SID packet, and quantize and pass the 
quantized parameters to the packetization engine 78 (Tackin Col. 36 lines 15-31). 
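
For illustration only, the role of the two holding-time parameters quoted above can be sketched as a simple bound on an adaptively chosen target. This is a minimal sketch, not Tackin's implementation: the function name and the idea that the adaptive logic supplies a proposed target in milliseconds are hypothetical; only the 0 msec and 500 msec defaults come from the quoted passage.

```python
def clamp_holding_time(target_ms, min_ms=0.0, max_ms=500.0):
    """Bound a proposed target holding time by the configured limits.

    With min_ms=0 and max_ms=500 the target may adapt freely within that
    range; setting min_ms equal to max_ms pins the holding time to a
    fixed value.
    """
    if min_ms > max_ms:
        raise ValueError("min_ms must not exceed max_ms")
    return max(min_ms, min(target_ms, max_ms))


adaptive_result = clamp_holding_time(620.0)           # limited by the maximum
fixed_result = clamp_holding_time(120.0, 80.0, 80.0)  # equal limits fix the value
```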

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Lew in view of Pillay and Gillick to 
incorporate estimating minimum and maximum bit window times; constructing a bit 
window from said minimum and maximum bit window and determining said estimated 
bit time from a running average of plural measurements of said estimated bit time as 
taught by Tackin to allow for a maximum and minimum time to produce a buffer having 
a reduced amount of jitter when extracting and quantizing information from a signal, 
wherein, through the use of a running/moving average, time-based data can be smoothed, 
reducing the number of fluctuations based on a maximum and minimum period (Tackin 
Col. 36 lines 15-31). 
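
For illustration only, the claim limitations discussed above (a bit window built from minimum and maximum times, the separation between successive transitions taken as a measurement of bit time, and a running average of plural measurements) can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the application or from any cited reference: the class name, the 25% tolerance, the eight-sample history, and the simplification that accepted transitions are one bit time apart are all hypothetical.

```python
from collections import deque


class BitTimeEstimator:
    """Running-average bit-time estimate gated by a min/max bit window."""

    def __init__(self, initial_bit_time, tolerance=0.25, history=8):
        self.bit_time = float(initial_bit_time)
        self.tolerance = tolerance            # assumed window half-width
        self.samples = deque(maxlen=history)  # most recent accepted measurements

    def window(self):
        """Return (minimum, maximum) bit window times around the estimate."""
        return (
            self.bit_time * (1 - self.tolerance),
            self.bit_time * (1 + self.tolerance),
        )

    def observe(self, interval):
        """Offer the time between two successive transitions.

        The interval counts as a measurement only if it falls inside the
        current bit window; the estimate is then the mean of the retained
        measurements. Returns True if the interval was accepted.
        """
        low, high = self.window()
        if not (low <= interval <= high):
            return False
        self.samples.append(float(interval))
        self.bit_time = sum(self.samples) / len(self.samples)
        return True


estimator = BitTimeEstimator(initial_bit_time=10.0)
accepted = [estimator.observe(t) for t in (11.0, 9.0, 20.0)]
```

In this run the first two intervals fall inside the window and are averaged, while the third lies outside the window and leaves the estimate unchanged.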

Conclusion 

7. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. US 6405093 B1, US 6628999 B1, US 6782300 B2. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Michael C. Colucci whose telephone number is 
(571)-270-1847. The examiner can normally be reached on 9:30 am - 6:00 pm, 
Monday-Friday. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil can be reached on (571)-272-7602. The fax phone 
number for the organization where this application or proceeding is assigned is 
571-273-8300. 
Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 